tactile reading
FBI: Learning Dexterous In-hand Manipulation with Dynamic Visuotactile Shortcut Policy
Chen, Yijin, Xu, Wenqiang, Yu, Zhenjun, Tang, Tutian, Li, Yutong, Yao, Siqiong, Lu, Cewu
Figure 1: We propose Flow Before Imitation (FBI), a novel dynamic visuotactile imitation learning algorithm for dexterous in-hand manipulation. FBI's design enables two operational modes, with or without physical tactile sensors in the real world, which greatly extends its application scenarios.
Dexterous in-hand manipulation is a long-standing challenge in robotics due to complex contact dynamics and partial observability. This paper introduces Flow Before Imitation (FBI), a visuotactile imitation learning framework that dynamically fuses tactile interactions with visual observations through motion dynamics. Unlike prior static fusion methods, FBI establishes a causal link between tactile signals and object motion via a dynamics-aware latent model. FBI employs a transformer-based interaction module to fuse flow-derived tactile features with visual inputs, training a one-step diffusion policy for real-time execution. Extensive experiments demonstrate that the proposed method outperforms the baseline methods in both simulation and the real world on two customized in-hand manipulation tasks and three standard dexterous manipulation tasks.
Zero-shot Sim2Real Transfer for Magnet-Based Tactile Sensor on Insertion Tasks
Han, Beining, Joshi, Abhishek, Deng, Jia
Tactile sensing is an important sensing modality for robot manipulation. Among different types of tactile sensors, magnet-based sensors, like u-skin, balance well between high durability and tactile density. However, the large sim-to-real gap of tactile sensors prevents robots from acquiring useful tactile-based manipulation skills from simulation data, a recipe that has been successful for achieving complex and sophisticated control policies. Prior work has implemented binarization techniques to bridge the sim-to-real gap for dexterous in-hand manipulation. However, binarization inherently loses much information that is useful in many other tasks, e.g., insertion. In our work, we propose GCS, a novel sim-to-real technique to learn contact-rich skills with dense, distributed, 3-axis tactile readings. We evaluate our approach on blind insertion tasks and show zero-shot sim-to-real transfer of RL policies with raw tactile reading as input.
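To illustrate why binarization discards information that a task such as insertion may need, here is a minimal sketch. It is not the GCS method or the binarization used in the prior work; the `binarize` helper, the threshold, and the sample readings are all hypothetical. It shows two contacts with opposite shear directions collapsing to the same binary pattern.

```python
from typing import List, Tuple

# Hypothetical per-taxel reading: (shear_x, shear_y, normal), arbitrary units.
Reading = Tuple[float, float, float]


def binarize(readings: List[Reading], threshold: float) -> List[int]:
    """Map each 3-axis taxel reading to 1 (contact) or 0 (no contact)."""
    flags = []
    for fx, fy, fz in readings:
        magnitude = (fx * fx + fy * fy + fz * fz) ** 0.5
        flags.append(1 if magnitude > threshold else 0)
    return flags


# Two made-up contacts that differ only in shear direction on the first taxel.
push_left = [(-0.8, 0.0, 0.5), (0.0, 0.0, 0.0)]
push_right = [(0.8, 0.0, 0.5), (0.0, 0.0, 0.0)]

# After binarization the two contacts are indistinguishable.
same_after_binarization = binarize(push_left, 0.1) == binarize(push_right, 0.1)
```

The dense readings still differ in the sign of the shear component, which is the kind of directional cue a policy could use to correct a misaligned peg.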
Learning Robust Grasping Strategy Through Tactile Sensing and Adaption Skill
Hu, Yueming, Li, Mengde, Yang, Songhua, Li, Xuetao, Liu, Sheng, Li, Miao
Robust grasping is an essential task in robotics that requires tactile feedback and reactive grasp adjustments. Previous research has extensively combined tactile sensing with grasping, primarily through rule-based approaches, and has frequently neglected post-grasp difficulties such as external disturbances or inherent uncertainties in the object's physics and geometry. To address these limitations, this paper introduces a human-demonstration-based adaptive grasping policy based on tactile sensing, which aims to achieve robust gripping while resisting disturbances to maintain grasp stability. Our trained model generalizes to daily objects with seven different sizes, shapes, and textures. Experimental results demonstrate that our method performs well in dynamic and force interaction tasks and exhibits excellent generalization ability.
- Asia > China > Hubei Province > Wuhan (0.05)
- Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)
Self-Supervised Visuo-Tactile Pretraining to Locate and Follow Garment Features
Kerr, Justin, Huang, Huang, Wilcox, Albert, Hoque, Ryan, Ichnowski, Jeffrey, Calandra, Roberto, Goldberg, Ken
Humans make extensive use of vision and touch as complementary senses, with vision providing global information about the scene and touch measuring local information during manipulation without suffering from occlusions. While prior work demonstrates the efficacy of tactile sensing for precise manipulation of deformables, it typically relies on supervised, human-labeled datasets. We propose Self-Supervised Visuo-Tactile Pretraining (SSVTP), a framework for learning multi-task visuo-tactile representations in a self-supervised manner through cross-modal supervision. We design a mechanism that enables a robot to autonomously collect precisely spatially-aligned visual and tactile image pairs, then train visual and tactile encoders to embed these pairs into a shared latent space using cross-modal contrastive loss. We apply this latent space to downstream perception and control of deformable garments on flat surfaces, and evaluate the flexibility of the learned representations without fine-tuning on 5 tasks: feature classification, contact localization, anomaly detection, feature search from a visual query (e.g., garment feature localization under occlusion), and edge following along cloth edges. The pretrained representations achieve a 73-100% success rate on these 5 tasks.
- North America > United States (0.14)
- Europe > Germany (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
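The SSVTP abstract above mentions embedding aligned visual and tactile pairs into a shared latent space with a cross-modal contrastive loss. The abstract does not give the exact formulation, so the following is only a generic symmetric InfoNCE-style sketch. The function names, the temperature value, and the toy embeddings are assumptions, and the embeddings are assumed to be non-zero vectors.

```python
import math
from typing import List

Vector = List[float]


def _normalize(v: Vector) -> Vector:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _diagonal_cross_entropy(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(visual: List[Vector], tactile: List[Vector],
                     temperature: float = 0.1) -> float:
    """Symmetric InfoNCE-style loss; pair i of each list is the positive."""
    v = [_normalize(x) for x in visual]
    t = [_normalize(x) for x in tactile]
    # Cosine similarity between every visual and every tactile embedding.
    logits = [[sum(a * b for a, b in zip(vi, tj)) / temperature for tj in t]
              for vi in v]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (_diagonal_cross_entropy(logits)
                  + _diagonal_cross_entropy(logits_transposed))


# Toy embeddings: aligned pairs should score a lower loss than swapped pairs.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Averaging the visual-to-tactile and tactile-to-visual terms treats both modalities symmetrically, which is a common choice when either one may serve as the query at test time.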
The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects
Gao, Ruohan, Dou, Yiming, Li, Hao, Agarwal, Tanmay, Bohg, Jeannette, Li, Yunzhu, Fei-Fei, Li, Wu, Jiajun
We introduce the ObjectFolder Benchmark, a benchmark suite of 10 tasks for multisensory object-centric learning, centered around object recognition, reconstruction, and manipulation with sight, sound, and touch. We also introduce the ObjectFolder Real dataset, including the multisensory measurements for 100 real-world household objects, building upon a newly designed pipeline for collecting the 3D meshes, videos, impact sounds, and tactile readings of real-world objects. We conduct systematic benchmarking on both the 1,000 multisensory neural objects from ObjectFolder, and the real multisensory data from ObjectFolder Real. Our results demonstrate the importance of multisensory perception and reveal the respective roles of vision, audio, and touch for different object-centric learning tasks. By publicly releasing our dataset and benchmark suite, we hope to catalyze and enable new research in multisensory object-centric learning in computer vision, robotics, and beyond. Project page: https://objectfolder.stanford.edu
- North America > United States > California > Santa Clara County > Palo Alto (0.24)
- Asia > Singapore (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Robots > Manipulation (0.68)
RobotSweater: Scalable, Generalizable, and Customizable Machine-Knitted Tactile Skins for Robots
Si, Zilin, Yu, Tianhong Catherine, Morozov, Katrene, McCann, James, Yuan, Wenzhen
Tactile sensing is essential for robots to perceive and react to the environment. However, it remains a challenge to make large-scale and flexible tactile skins on robots. Industrial machine knitting provides solutions to manufacture customizable fabrics. Along with functional yarns, it can produce highly customizable circuits that can be made into tactile skins for robots. In this work, we present RobotSweater, a machine-knitted pressure-sensitive tactile skin that can be easily applied on robots. We design and fabricate a parameterized multi-layer tactile skin using off-the-shelf yarns, and characterize our sensor on both a flat testbed and a curved surface to show its robust contact detection, multi-contact localization, and pressure sensing capabilities. The sensor is fabricated using a well-established textile manufacturing process with a programmable industrial knitting machine, which makes it highly customizable and low-cost. The textile nature of the sensor also makes it easily fit curved surfaces of different robots and have a friendly appearance. Using our tactile skins, we conduct closed-loop control with tactile feedback for two applications: (1) human lead-through control of a robot arm, and (2) human-robot interaction with a mobile robot.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
MAT: Multi-Fingered Adaptive Tactile Grasping via Deep Reinforcement Learning
Wu, Bohan, Akinola, Iretiayo, Varley, Jacob, Allen, Peter
Vision-based grasping systems typically adopt an open-loop execution of a planned grasp. This policy can fail for many reasons, including ubiquitous calibration error. Recovery from a failed grasp is further complicated by visual occlusion, as the hand is usually occluding the vision sensor as it attempts another open-loop regrasp. This work presents MAT, a tactile closed-loop method capable of realizing grasps provided by a coarse initial positioning of the hand above an object. Our algorithm is a deep reinforcement learning (RL) policy optimized through the clipped surrogate objective within a maximum entropy RL framework to balance exploitation and exploration. The method utilizes tactile and proprioceptive information to act through both fine finger motions and larger regrasp movements to execute stable grasps. A novel curriculum of action motion magnitude makes learning more tractable and helps turn common failure cases into successes. Careful selection of features that exhibit small sim-to-real gaps enables this tactile grasping policy, trained purely in simulation, to transfer well to real-world environments without the need for additional learning. Experimentally, this methodology improves over a vision-only grasp success rate substantially on a multi-fingered robot hand. When this methodology is used to realize grasps from coarse initial positions provided by a vision-only planner, the system is made dramatically more robust to calibration errors in the camera-robot transform.
- North America > United States (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
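The MAT abstract above refers to a clipped surrogate objective combined with a maximum entropy framework. As a rough sketch of that idea, and not the authors' implementation, the PPO-style clipped objective can be written as below. The entropy bonus is a simple stand-in for the maximum entropy term, and the function name, `clip_eps`, and `entropy_coef` are assumed for illustration.

```python
import math
from typing import Sequence


def clipped_surrogate(new_log_probs: Sequence[float],
                      old_log_probs: Sequence[float],
                      advantages: Sequence[float],
                      entropies: Sequence[float],
                      clip_eps: float = 0.2,
                      entropy_coef: float = 0.01) -> float:
    """Mean clipped surrogate objective (to be maximized) plus entropy bonus."""
    total = 0.0
    for lp_new, lp_old, adv, ent in zip(new_log_probs, old_log_probs,
                                        advantages, entropies):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes the incentive to move the ratio far from 1.
        surrogate = min(ratio * adv, clipped_ratio * adv)
        # Assumed entropy bonus, standing in for the maximum entropy term.
        total += surrogate + entropy_coef * ent
    return total / len(advantages)


# A ratio of 2 with positive advantage is capped at 1 + clip_eps.
capped = clipped_surrogate([math.log(2.0)], [0.0], [1.0], [0.0],
                           entropy_coef=0.0)
# With negative advantage the unclipped (more pessimistic) term is kept.
uncapped = clipped_surrogate([math.log(2.0)], [0.0], [-1.0], [0.0],
                             entropy_coef=0.0)
```

In practice the negative of this value would be minimized by a gradient-based optimizer over batches of simulated grasp attempts.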
Learning to Identify Object Instances by Touch: Tactile Recognition via Multimodal Matching
Lin, Justin, Calandra, Roberto, Levine, Sergey
Much of the literature on robotic perception focuses on the visual modality. Vision provides a global observation of a scene, making it broadly useful. However, in the domain of robotic manipulation, vision alone can sometimes prove inadequate: in the presence of occlusions or poor lighting, visual object identification might be difficult. The sense of touch can provide robots with an alternative mechanism for recognizing objects. In this paper, we study the problem of touch-based instance recognition. We propose a novel framing of the problem as multi-modal recognition: the goal of our system is to recognize, given a visual and tactile observation, whether or not these observations correspond to the same object. To our knowledge, our work is the first to address this type of multi-modal instance recognition problem at such a large scale, with our analysis spanning 98 different objects. We employ a robot equipped with two GelSight touch sensors, one on each finger, and a self-supervised, autonomous data collection procedure to collect a dataset of tactile observations and images. Our experimental results show that it is possible to accurately recognize object instances by touch alone, including instances of novel objects that were never seen during training. Our learned model outperforms other methods on this complex task, including that of human volunteers.
- North America > United States > California > Alameda County > Berkeley (0.05)
- North America > United States > Massachusetts (0.04)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
- Health & Medicine (0.46)
- Consumer Products & Services (0.46)